Rough set and scatter search metaheuristic based feature selection for credit scoring
نویسندگان
چکیده
As the credit industry has been growing rapidly, credit scoring models have been widely used by the financial industry during this time to improve cash flow and credit collections. However, a large amount of redundant information and features are involved in the credit dataset, which leads to lower accuracy and higher complexity of the credit scoring model. So, effective feature selection methods are necessary for credit dataset with huge number of features. In this paper, a novel approach, called RSFS, to feature selection based on rough set and scatter search is proposed. In RSFS, conditional entropy is regarded as the heuristic to search the optimal solutions. Two credit datasets in UCI database are selected to demonstrate the competitive performance of RSFS consisted in three credit models including neural network model, J48 decision tree and Logistic regression. The experimental result shows that RSFS has a superior performance in saving the computational costs and improving classification accuracy compared with the base classification methods. 2011 Elsevier Ltd. All rights reserved.
منابع مشابه
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High dimensional microarray datasets are difficult to classify since they have many features with small number ofinstances and imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improvethe classification performance of microarray datasets by selecting the significant features. Combining the concepts ofrough sets, weighted rough set, fuzzy rough se...
متن کاملSolving feature subset selection problem by a Parallel Scatter Search
The aim of this paper is to develop a parallel Scatter Search metaheuristic for solving the Feature Subset Selection Problem in classification. Given a set of instances characterized by several features, the classification problem consists of assigning a class to each instance. Feature Subset Selection Problem selects a relevant subset of features from the initial set in order to classify futur...
متن کاملFuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...
متن کاملA Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...
متن کاملScatter Search for the Feature Selection Problem
The feature selection problem in the field of classification consists of obtaining a subset of variables to optimally realize the task without taking into account the remainder variables. This work presents how the search for this subset is performed using the Scatter Search metaheuristic and is compared with two traditional strategies in the literature: the Forward Sequential Selection (FSS) a...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Expert Syst. Appl.
دوره 39 شماره
صفحات -
تاریخ انتشار 2012